DOCUMENT RESUME 



ED 226 023                                        TM 830 037

AUTHOR          Kaiser, Javaid
TITLE           The Predictive Validity of GRE Aptitude Test.
PUB DATE        12 Nov 82
NOTE            22p.; Paper presented at the Annual Meeting of the
                Rocky Mountain Research Association (Albuquerque, NM,
                November 12, 1982).
PUB TYPE        Speeches/Conference Papers (150) -- Reports -
                Research/Technical (143)
EDRS PRICE      MF01/PC01 Plus Postage.
DESCRIPTORS     Admission Criteria; *College Entrance Examinations;
                Computer Science Education; Education Majors; Grade
                Prediction; *Graduate Study; Higher Education;
                Multiple Regression Analysis; *Predictive
                Measurement; *Predictive Validity
IDENTIFIERS     *Graduate Record Examinations



ABSTRACT 

The continued controversy concerning the predictive 
validity of the Graduate Record Examination (GRE) aptitude test and 
its influence on selection decisions, including admission and 
financial aid, has necessitated the establishment of local norms. The 
sample involved 407 University of Kansas education and computer 
science students. Information on GRE verbal (GRE-V), GRE quantitative
(GRE-Q), GRE verbal and quantitative, undergraduate grade point
average (UGPA), graduate grade point average (GGPA), major field of
study, sex, and year of enrollment was recorded. GGPA was selected
as a criterion variable. The remaining variables were treated as 
predictors. Stepwise multiple regression was applied as a statistical 
tool to analyze the data. Data analysis revealed that the verbal 
score on GRE was the best single predictor of GGPA for education 
students. Based on zero-order correlations, computer science 
students' UGPA was a better predictor than GRE scores. The findings 
agree with previous studies that the GRE-V is the best single
predictor of GGPA for majors that are descriptive in nature and GRE-Q
is the best predictor for symbol-oriented disciplines. Tests for the
equality of regression equations developed for the two student groups
found significant differences, suggesting separate selection
procedures in the two departments. (Author/PN)

The Predictive Validity of GRE Aptitude Test

As with every measuring instrument, the validity and reliability of
the GRE aptitude test are crucial to its continued use in graduate
schools. In view of the need to validate the GRE, Educational Testing
Service and many individual institutions and researchers have been
attempting to uncover the strengths and weaknesses of this instrument
for the last two decades. The results obtained from these studies vary
significantly from one another. Most of the studies have serious
methodological problems such as small sample size and restriction of
range on both the predictors and the criterion. There is also
controversy about which variables should be considered a true
representation of students' performance in graduate school. Though most
of the studies used grade point average (GPA) as the dependent
variable, criteria such as faculty ratings, time taken to complete the
degree, attainment of the degree, and performance on departmental
comprehensive examinations have also been used in the past.

The studies that used GPA in graduate school (GGPA) as the criterion
are Alexakas (1967), Borg (1963), Capps and Decosta (1957), Clark
(1968), Conway (1955), Department of History, UCL (1963), Duff and Aukes
(1966), Eckhoff (1966), Gorman (1953), Hackman, Wiggins, and Bass
(1970), Hanson (1970), Johnson and Thompson (1962), Madaus and Walsh
(1965), Newman (1968), Office of Educational Research (1963), Office of
Institutional Analysis (1966), Office of Institutional Research and
Service (1958), Olsen (1955), Roberts (1970), Robinson (1957), Roscoe
and Houston (1969), Shafer and Rosenfeld (1969), Sleeper (1961), Wallace
(1952), White (1954), White (1967), and Williams, Harlow, and Grab
(1970). Median r values of .24, .23, .33, and .31 were observed
when GGPA was correlated with GRE verbal score (GRE-V), GRE quantitative
score (GRE-Q), GRE total score (GRE-V+Q), and undergraduate GPA (UGPA),
respectively, based on 46, 43, 30, and 26 studies. A median R value of
.45, based on 24 studies, was obtained when GRE scores and UGPA were
used together as a composite (Concard, Trisman, and Miller, 1977).

Bergmann (1960), Besco (1960), Duff and Aukes (1966), Harvey
(1963), King and Besco (1960), Lannholm (1960), Law (1960), Michels
(1966), Office of Educational Research (1963), Olsen (1955), Robertson
and Hall (1964), Robertson and Nielson (1961), Test Office, Sacramento
State College (1969), Tully (1962), and Wallace (1952) were identified
as studies that used faculty ratings as a criterion. Median r values
of .31, .27, .41, and .37 were obtained when faculty ratings were
correlated with GRE-V, GRE-Q, GRE-V+Q, and UGPA, respectively, based on
27, 25, 8, and 15 studies (Concard et al., 1977).

Median r values of .18, .26, and .14 were obtained when attainment
of the Ph.D., the criterion, was correlated with GRE-V, GRE-Q, and
GRE-V+Q, respectively. The studies in this category include
Bensen (1958), Creager (1965), Departmental Memo, UCL (1970), Ewen
(1969), Fleury and Coppelluzzo (1969), Harmon (1966), Lannholm (1960),
Roberts (1970), Rock (1974), Roscoe et al. (1969), Rupiper (1959),
Voorhees (1960), and Williams, Harlow, and Grab (1970). A median
multiple R of .31 for GRE-V+Q and of .40 for the GRE-UGPA composite was
obtained when attainment of the Ph.D. was the dependent variable.
Median r coefficients in the range of .16 to .40 were obtained when time
taken to complete the Ph.D. was used as the criterion and GRE scores
were the predictors.

Gorman (1953), Lorge (1960), Michael, Jones, and Gibson (1960),
Michael, Jones, and others (1971), and Sistrunk (1961) used performance
on departmental comprehensive examinations as the dependent variable.
Concard et al. (1977) reported median correlations of .42 and .27 for
this group of studies. The median values are based on 5 and 2 studies,
respectively.

The research reviewed to this point shows that GRE-V had its highest
median r (.31) when faculty ratings were used as the dependent variable
and its lowest median r (.16) when time taken to complete the degree
was the criterion. Likewise, the lowest median r (.23) for GRE-Q was
obtained with GGPA as the criterion, compared to a value of .27
obtained when faculty ratings were the dependent variable. The lowest
median value (.31) for GRE-V+Q was associated with attainment of the
Ph.D. as the criterion, while the highest value (.41) was obtained with
faculty ratings. UGPA gave the highest median r (.37) with faculty
ratings and the lowest value (.14) with the attainment of the Ph.D. The
multiple R for the GRE-UGPA composite ranged from .40 to .45. The
maximum value was obtained when GGPA was used as the criterion.

Though the overall results support the use of faculty ratings as an
effective criterion, such ratings are in fact subjective and
unreliable. Departmental examinations lack generalizability over
departments even within the same institution. Attainment of the degree,
time taken to complete the degree, and similar criteria are unsuitable
because they are influenced as much or more by factors such as
motivation, persistence, work conditions, and financial status as by
academic ability. Grade point average has also been criticized because
of its limited range. Graduate students generally represent a highly
select group with respect to academic ability and past performance. By
the time they are admitted to graduate school, further restriction of
range has been introduced. This deflates the obtained coefficients and
makes them look lower than the national norms (ETS, 1978; Wilson,
1977). Moreover, the grades of one institution may not match the grades
of another institution in terms of the expertise and skill required.
Many educationists doubt that even reliable grades can represent the
most important outcomes of education. Though no single criterion is
completely satisfactory, the use of several criteria may represent a
satisfactory compromise (Concard et al., 1977).

In terms of the best predictors available, the general consensus is
that GRE-V is the best single predictor of GGPA for majors that are
descriptive in nature and GRE-Q is the best predictor for
symbol-oriented disciplines. GRE scores together with UGPA are
considered the best possible composite of multiple predictors. Letters
of recommendation are not considered a reliable predictor because they
are subjective, generally biased, and hard to quantify. As far as the
overall performance of the test is concerned, it is considered a good
predictor of graduate school performance for the majority group.
However, serious doubts exist about its possible bias against ethnic
minorities.




Purpose of the Study 

The continued controversy about the predictive validity of the GRE
aptitude test and its overwhelming influence on selection decisions,
including admission and financial aid, has necessitated the
establishment of local norms (Cronbach, 1971; Willingham, 1976). The
present study was therefore designed (1) to investigate the
justification for continued use of the instrument for education and
computer science students at the University of Kansas and (2) to
develop norms for use by local officials who classify students on the
basis of GRE scores. The study involved (1) the identification of the
best possible set of predictors of student performance in graduate
school, (2) the development of separate regression equations for the
education and computer science groups, and (3) the testing of the
equality of the regression equations so developed.

Procedure 

All currently active students enrolled in the School of Education
and in the Department of Computer Science whose GRE scores were
available were included in this study. This strategy yielded 356
education and 51 computer science students, for a total sample size of
407. Students who were denied admission by the two departments under
consideration could not be included in this study because the desired
information was not available. Information on GRE-V, GRE-Q, GRE-V+Q,
UGPA, GGPA, major field of study (Major), sex, and year of initial
enrollment (Year) was recorded for each subject. Major had two levels:
(1) education and (2) computer science; Sex had two categories: (1)
male and (2) female; and Year had four levels: (1) 1974 or earlier, (2)
1975, (3) 1976, and (4) 1977 or later. GGPA was selected as the
criterion variable in spite of its restriction-of-range limitations.
The rest of the variables included in this study were treated as
predictors. Stepwise multiple regression was applied as a statistical
tool to analyze the data. Nominal variables like Major, Sex, and Year
were dummy coded into (n-1) independent vectors before their inclusion
in the regression equation. Here, n refers to the number of levels in
a variable (Kerlinger, 1977). The stability of the obtained R was
determined at each step in the development of the regression equation
by computing the shrunken R with the Lord and Nicholson formula
(Nicholson, 1960). The statistics reported by Concard et al. (1977) on
34,443 individuals who took the GRE between October 1974 and June 1976
and intended to take education as their major were included in this
study as a reference group. The statistics on this group were
considered national norms because they were representative of the
entire country. A similar reference group was selected for computer
science students, and the statistics on it were treated as national
norms. This reference group consisted of 922 individuals who intended
to take computer science as their major in graduate studies (Concard et
al., 1977).
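
The dummy coding described above can be sketched as follows. This is a minimal illustration, not the author's actual procedure: the function name `dummy_code`, the choice of the last level as the reference category, and the sample records are all assumptions made for the example.

```python
def dummy_code(values, levels):
    """Code a nominal variable with n levels into n-1 binary vectors.

    The last entry of `levels` is treated as the reference category
    (an assumption for this sketch); it is coded as all zeros.
    """
    coded_levels = levels[:-1]
    return [[1 if v == level else 0 for level in coded_levels] for v in values]


# Year of initial enrollment has four levels, so it yields 4 - 1 = 3 vectors.
year_levels = [1, 2, 3, 4]      # 1 = 1974 or earlier ... 4 = 1977 or later
sex_levels = [1, 2]             # 1 = male, 2 = female

# Hypothetical students, not data from the study.
years = [1, 4, 2, 3]
sexes = [1, 2, 2, 1]

year_vectors = dummy_code(years, year_levels)
sex_vectors = dummy_code(sexes, sex_levels)

print(year_vectors)  # one row per student, three columns
print(sex_vectors)   # one row per student, one column
```

Each student contributes one row; a student in the reference category is represented by a row of zeros, which is why n levels need only n-1 vectors.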

Results and Discussion 

Table 1 shows the means and standard deviations computed on each
variable and their intercorrelations for the education and computer
science groups. The mean performance of education students was higher
on both the GRE-V and the GRE-Q than the national norms (V=473.8;
Q=473.1). However, the standard deviations of the sample were lower
than the national norms (V=107; Q=116). This was to be expected because
some of the low-scoring individuals were denied admission and therefore
were not in the sample, while the national group was unrestricted in
this respect. The education group for this study was, therefore, above
average and more homogeneous than the national sample. The same was
true for the computer science group when compared with its reference
group. The means on both GRE-V and GRE-Q were higher than the national
norms (V=523; Q=669). The standard deviations on the two scores were
lower for this group than the national norms (SD for V=128; SD for
Q=100). This, again, can be attributed to the selection effect.
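
The deflating effect of restriction of range noted here can be illustrated with the standard correction for direct range restriction on the predictor (often called Thorndike's Case 2). The study itself does not report corrected coefficients, so the sketch below is only an illustration. Treating the national standard deviation (107) as the unrestricted value, and pairing it with the education group's GRE-V standard deviation (94.59) and GRE-V/GGPA correlation (.21) from Table 1, are assumptions made for the example; the function name is hypothetical.

```python
import math


def correct_for_range_restriction(r, sd_restricted, sd_unrestricted):
    """Thorndike Case 2 correction for direct restriction on the predictor."""
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1 - r**2 + (r**2) * (u**2))


# Education group, GRE-V vs. GGPA (values taken from the text and Table 1).
r_observed = 0.21
r_corrected = correct_for_range_restriction(r_observed, 94.59, 107.0)
print(round(r_corrected, 3))
```

Under these assumptions the corrected coefficient is only slightly larger than the observed one, because the sample standard deviation is not much smaller than the national value.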



Insert Table 1 here 



The correlation coefficients between the predictors and the
criterion obtained in this study were compared with the median r
values. These median values were computed from the results of GRE
validity studies completed prior to 1972 (Concard et al., 1977). The
correlation matrix of the present study revealed that the verbal scores
were significantly correlated with GGPA (p < .05) for the education
group. This coefficient was, however, lower than the median coefficient
of .36 that was based on 15 studies completed on education students.
The median r coefficient obtained from 46 studies completed on a
variety of disciplines was .24. The correlation between GRE-Q and GGPA
for the education group was also lower than the median r coefficient of
.28 obtained from 14 studies conducted on education students. The
median correlation between GRE-Q and GGPA obtained from 43 studies
conducted on several fields of study was .23, which was higher than
that obtained in the present study. The median coefficient for UGPA for
education students was .30 and was based on 5 studies. A total of 15
studies that covered different majors but used UGPA as the predictor
led to a median r of .37. The r value obtained for UGPA in the present
study was lower than both median values. For the GRE-UGPA weighted
composite, 7 studies were completed on education students and a median
R of .42 was obtained, which is substantially higher than the value
obtained in this study.

There was no summarized data available in the form of median
validity coefficients for computer science students. Therefore, the
results obtained for this group in the present study were compared with
the median values obtained from studies that included engineering and
applied science students as subjects. The computer science students in
this study had lower r values than the median coefficients
on GRE-V, GRE-Q, UGPA, and the GRE-UGPA weighted composite. The median
values for these predictors were .29, .31, .18, and .42. The first two
values were based on 11 and 10 studies, respectively, and the last two
were based on 4 studies each.

Though the correlation coefficients obtained in this study were
lower than the median values on all the predictors included in this
study, they were found to be more stable than the coefficients used to
determine such median values. The median r's were inflated because the
coefficients used to determine them were inflated. The inflated values
in individual studies were, in turn, caused by small sample sizes. The
studies used in computing median r values had a median sample size of
30, and the lowest sample size in such studies was 20.

After this preliminary examination, two sets of regression
equations were developed for each group. The first one included
GRE-V, GRE-Q, UGPA, sex, and year of enrollment as independent
variables. In the second equation, GRE-V+Q was substituted for GRE-V and
GRE-Q while the other predictors were unchanged. The order of inclusion
of predictors was also the same in both situations. The analysis
revealed that GRE-V contributed most to predicting GGPA (p < .01) for
the education group. The unique contribution of GRE-Q over and above
GRE-V was non-significant (p > .05). Slight increments in R were
produced by UGPA and Year, but their unique contribution was
insignificant (p > .05) over and above the predictors that were already
in the equation at the time of their insertion. The impact of sex over
and above GRE scores and UGPA was insignificant (p > .05). Similar
results were obtained in the second set of analyses for the same group
when GRE-V+Q was substituted for GRE-V and GRE-Q. An interesting result
was that the overall predictability dropped slightly with the second
equation, discouraging the use of total scores on the GRE as a
substitute for GRE subtest scores. This finding also suggested that
assigning equal weights to verbal and quantitative scores is not
desirable. The GRE-UGPA weighted composite produced a multiple
correlation of .23, which was much lower than the median R (.45)
obtained from 24 studies completed between 1952 and 1972. The overall
multiple correlation obtained for the education group was .28.
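
The significance tests referred to in this paragraph are the usual F tests for a multiple correlation and for the increment produced by added predictors. The sketch below shows both under the standard formulas. The function names are illustrative, and the inputs are rounded R values of the kind listed in Table 2, so the results are not expected to reproduce the tabled F ratios exactly.

```python
def overall_f(r, n, k):
    """F ratio for a multiple correlation R with k predictors and n cases."""
    r2 = r * r
    df1, df2 = k, n - k - 1
    return (r2 / df1) / ((1 - r2) / df2), (df1, df2)


def increment_f(r_full, r_reduced, n, k_full, k_reduced):
    """F ratio for the gain in R-squared when predictors are added."""
    df1, df2 = k_full - k_reduced, n - k_full - 1
    gain = r_full**2 - r_reduced**2
    return (gain / df1) / ((1 - r_full**2) / df2), (df1, df2)


# Education group (n = 356): GRE-V alone, then UGPA added as a third predictor.
f_overall, df_overall = overall_f(0.21, 356, 1)
f_gain, df_gain = increment_f(0.23, 0.21, 356, 3, 2)
print(df_overall, round(f_overall, 2))
print(df_gain, round(f_gain, 2))
```

The degrees of freedom follow directly from the sample size and the number of predictors in the equation at each step; the F ratios themselves are sensitive to rounding in R.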

Insert Table 2 here



The values of the shrunken R represent the multiple correlation
coefficient one would expect if the study were replicated on another
sample and the regression equations developed in this study were used to
predict the criterion. These estimated coefficients, listed in Table 2,
support the findings obtained in this study.
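
As a rough check on how the shrunken R behaves, the sketch below uses the form of the Lord and Nicholson cross-validation formula commonly given in the regression literature: estimated R-squared = 1 - (1 - R^2) * [(N - 1)/(N - k - 1)] * [(N + k + 1)/N], where N is the sample size and k the number of predictors. The paper does not print the formula, so treating this as the exact version used is an assumption, and the function name is illustrative.

```python
import math


def shrunken_r(r, n, k):
    """Estimated cross-validated R (Lord-Nicholson form, assumed here).

    Returns None when the estimated squared value is negative, in which
    case no real-valued shrunken R exists.
    """
    r2_hat = 1 - (1 - r * r) * ((n - 1) / (n - k - 1)) * ((n + k + 1) / n)
    return math.sqrt(r2_hat) if r2_hat >= 0 else None


# Computer science group (n = 51), R values as listed in Table 2.
cs_step1 = shrunken_r(0.27, 51, 1)   # UGPA alone
cs_step2 = shrunken_r(0.31, 51, 2)   # UGPA and GRE-Q
cs_step3 = shrunken_r(0.35, 51, 3)   # UGPA, GRE-Q, and sex

print(round(cs_step1, 2))
print(round(cs_step2, 2))
print(cs_step3)
```

With these inputs the first two results round to .13 and .06, and the third has no real value because the estimated squared coefficient is negative. This shows how quickly a multiple R shrinks when the sample is small relative to the number of predictors.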


In spite of the low correlation coefficients, it was apparent that
GRE-V is the best single predictor of GGPA for education students and
that GRE-V and UGPA form the best composite for predicting the
criterion. This finding supports earlier findings that GRE-V is the
best single predictor of GGPA for disciplines that are descriptive in
nature (Lannholm, 1972; Concard et al., 1977).

For the computer science group, the order of inclusion of the
predictors into the regression equation was changed from that for the
education group because the correlation matrix of this group suggested
the insertion of UGPA as the first predictor. GRE-V and GRE-Q were added
next, but GRE-V could not meet the tolerance level and appeared as the
last predictor in the equation. The predictors are listed in Table 2 in
the order they appeared in the equation. In spite of the high multiple
correlation coefficient, none of the predictors contributed
significantly to predicting the criterion (p > .05). The overall R was
not significant (p > .05) at any step. High but insignificant R values
might be the result of the small sample size (n = 51). However, the
correlation matrix for this group and the multiple correlations
obtained suggested that UGPA is the best single predictor of GGPA for
computer science students. In terms of multiple predictors, UGPA and
GRE-Q would be the preferred composite. The values of the shrunken R
supported the statement that UGPA and GRE-Q were the only potential
predictors for this group.

The inference drawn earlier that the use of GRE-V+Q lowers the
overall predictability of GGPA was also found true for the computer
science group. GRE-V was the least significant predictor of GGPA
for the computer science group, while GRE-Q was the least significant
predictor for the education group. This finding was supported by
previous studies that concluded that GRE-V is a good predictor for
majors of a descriptive nature and that GRE-Q is more suitable for
symbol-oriented disciplines (Concard et al., 1977).

When the regression equations developed for the education and
computer science groups were tested for equality, significant
differences were found (F = 4.421; df = 4, 399; p = .002). The
equations tested included GRE-V, GRE-Q, and UGPA as predictors. The
findings therefore suggest the use of separate selection procedures in
the two departments.
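
A test for the equality of two regression equations is commonly carried out as an F test that compares one pooled regression with separate regressions per group (often called a Chow test). The paper does not state which procedure was used, so the sketch below is an assumption. It takes residual sums of squares as inputs, and the numeric values passed in are hypothetical. The degrees of freedom it produces for 356 and 51 students, with three predictors plus an intercept, agree with the df = 4, 399 reported in the text.

```python
def equality_of_regressions_f(ssr_pooled, ssr_group1, ssr_group2, n1, n2, n_params):
    """F ratio for H0: both groups share one regression equation.

    n_params counts the intercept plus the predictors in each equation.
    """
    ssr_separate = ssr_group1 + ssr_group2
    df1 = n_params
    df2 = n1 + n2 - 2 * n_params
    f_ratio = ((ssr_pooled - ssr_separate) / df1) / (ssr_separate / df2)
    return f_ratio, (df1, df2)


# 356 education and 51 computer science students; GRE-V, GRE-Q, UGPA + intercept.
# The residual sums of squares below are made-up numbers for illustration only.
f_ratio, df = equality_of_regressions_f(60.0, 45.0, 12.0, 356, 51, 4)
print(df, round(f_ratio, 2))
```

A large F indicates that fitting the two groups separately reduces the residual variation by more than chance would explain, which is the basis for recommending separate equations.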

Conclusion 

The data analysis revealed that, in spite of all the doubts about
the GRE aptitude test and the problem of restriction of range, the
verbal score on the GRE was the best single predictor of graduate
school GPA for education students. Undergraduate GPA, sex, and year of
enrollment did not increase the predictability of the criterion
significantly over and above the prediction made by verbal scores
alone. The composite of GRE verbal score and undergraduate GPA was
considered the best possible set of multiple predictors.

For computer science students, none of the predictors contributed
significantly to prediction of the criterion at any stage of the
regression analysis. However, based on zero-order correlations, it was
apparent that the undergraduate GPA was a better predictor than the GRE
scores.

The continued use of total scores on the GRE is considered
inappropriate for both groups, as it underpredicted the criterion. The
test for the equality of regression equations developed for the two
groups suggested separate selection procedures in the two departments.

TABLE 1

Sample size, mean (X), SD, and correlations
for Education Majors and Computer
Science Majors on 4 predictors and
the criterion

Major             Var.    1       2       3       4       5       N        X        SD

Education          1      --     .41**   .80**   .15     .21*    356    524.[?]    94.59
                   2              --     .86**   .20*    .11            500.90     [?]
                   3                      --     .22*    .19           1024.55    172.91
                   4                              --     .12              3.02     [?]
                   5                                      --              3.65      0.36

Computer Science   1      --     .26**   .87**   .08     .05      51    604.51     96.59
                   2              --     .70**   .07     .17            694.31     66.67
                   3                      --     .04     .12           1298.82    131.11
                   4                              --     .27**            3.20      0.48
                   5                                      --              3.51      0.48

Variables: 1 = GRE-V, 2 = GRE-Q, 3 = GRE-V+Q, 4 = UGPA, 5 = GGPA
[?] = digits illegible in the source copy
 * p < .05
** p < .01



TABLE 2

R, R², R̂ and Their Tests of Significance

                                                     Test of Overall     Test of Increment
                                                     Significance        in R
Major & Set         Variables    R     R²     R̂      df       F          df       F

Education           GRE-V       .21   .04    .18     1,354   15.66**     1,354   15.66**
Set 1               GRE-Q       .21   .04    .17     2,353    8.01**     1,353    0.39
                    UGPA        .23   .05    .18     3,352    6.27**     1,352    2.71
                    SEX         .23   .05    .16     4,351    4.71**     1,351    0.07
                    YOE         .28   .09    .21     8,347    4.16**     4,347    3.48

Education           GRE-V+Q     .19   .03    .16     1,354   12.63**     1,354   12.63**
Set 2               UGPA        .20   .04    .17     2,353    7.56**     1,353    2.45
                    SEX         .21   .04    .15     3,352    5.17**     1,352    0.40
                    YOE         .23   .08    .20     7,348    4.31**     4,348    3.28

Computer Science    UGPA        .27   .07    .13     1,49     3.81       1,49     3.81
Set 1               GRE-Q       .31   .10    .06     2,48     2.56       1,48     1.29
                    SEX         .35   .12     +      3,47     2.20       1,47     1.44
                    YOE         .48   .23     +      7,43     1.84       4,43     1.50
                    GRE-V       .48   .23     +      8,42     1.58       1,42     0.01

Computer Science    UGPA        .27   .07    .13     1,49     3.81       1,49     3.81
Set 2               GRE-V+Q     .29   .08     +      2,48     2.15       1,48     0.53
                    SEX         .33   .11     +      3,47     1.95       1,47     1.49
                    YOE         .46   .21     +      7,43     1.70       4,43     1.45

 * p < .05
** p < .01
 + [footnote text illegible in the source copy]
R̂ = Estimated shrunken R based on Lord Nicholson formula
YOE = year of enrollment



